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ABSTRACT 

For investigating the relationship between the star formation rate and gas sur- 
face density, we develop a Bayesian linear regression method that rigorously treats 
measurement uncertainties and accounts for hierarchical data structure. The hier- 
archical Bayesian method simultaneously estimates the intercept, slope, and scatter 
■ about the regression line of each individual subject (e.g. a galaxy) and the popula- 

tion (e.g. an ensemble of galaxies). Using synthetic datasets, we demonstrate that the 
method accurately recovers the underlying parameters of both the individuals and the 
population, especially when compared to commonly employed ordinary least squares 
techniques, such as the bisector fit. We apply the hierarchical Bayesian method to es- 
ti mate the Kennicutt- Schmidt (KS) parameters of a sample of spiral galaxies compiled 
by iBigiel et al.l (|2008l) . We find significant variation in the KS parameters, indicating 
that no single KS relationship holds for all galaxies. This suggests that the relationship 
between molecular gas and star formation differs from galaxy to galaxy, possibly due 
to the influence of other physical properties within a given galaxy, such as metallicity, 
molecular gas fraction, stellar mass, and/or magnetic fields. In four of the seven galax- 
ies the slope estimates are sub-linear, especially for M51, where unity is excluded at 
the 2cr level. We estimate the mean index of the KS relationship for the population to 
be 0.84, with 2a range [0.63, 1.0]. For the galaxies with sub-linear KS relationships, 
a possible interpretation is that CO emission is tracing some molecular gas that is 
not directly associated with star formation. Equivalently, a sub-linear KS relationship 
may be indicative of an increasing gas depletion time at higher surface densities, as 
traced by CO emission. The hierarchical Bayesian method can account for all sources 
of uncertainties, including variations in the conversion of observed intensities to star 
formation rates and gas surface densities (e.g. the Xco factor), and is therefore well 
suited for a thorough statistical analysis of the KS relationship. 
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1 INTRODUCTION 

1.1 The Kennicutt-Schmidt relationship 

Accurately measuring the properties of star formation is 
crucial for understanding a variety of astrophysical top- 
ics, including the interstellar medium (ISM), galaxy struc- 
ture and dynamics, and the evolution of the universe as a 
whole. Observations have revealed strong power-law correla- 
tions between the star formation rate Esfr and gas surface 
densities in disk galaxies, now referred to as the Schmidt 
or Kennicutt-Schmidt (hereafter KS) law (ISchmidtl 1 19591; 
lKennicuttlll989l . QUI; iKennicutt fc Evans! [20121 '). Explain- 
ing this KS law has been a major priority in the star forma- 
tion research community (see e.g. lMac Low fc Klesserj|2004l ; 



iMcKee fc Ostrikerl 120071 ; IKennicutt fc Evans! 12013 . and ref- 
erences therein). 

Recent studies have indicated that much of the KS cor- 
relation is driven by the molecular component E mo i: 

Sspr = aS^ ol , (1) 



using either azi muthal averages or data from individua l 



sub-kpc regions JRownd & Younel 19991; Wong & Blitz 


2002; 


Hever et al.ll2004 IKennicutt et al. 2007; iBiriel et all 


2008, 


2011 


; Lerov et al.ll2008l; Rahman et al. 201ll; Schruba et al. 


2011 


). The molecular component is usually inferred through 



CO observations and traces much of the cold, dense gas in 
the ISM, because HI saturates above surface densities ~10 
Mq pc -2 . The range in power-law indices has been esti- 
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mated from ^ 0.9 to ^ 3 (see intro i n lBigiel et aljfeoost 
and review by iKennicutt fc E vans 2012), and depends on a 
variety of caveats, such as the observ ed scale, tracer, cal- 



ibration of star formation rates (e . g. Calzetti et al. 120071 : 



iRahman et~ai]|201ll ; ITiu et al.ll201ll ; llerov et al.ll2012T ). and 

a host of properties of the sourc e, such as the metallicity, gas 
fract i ons, and stellar mass (e.g. iLerov et al.ll2008l ; IShi et all 
l201ll ; ISaintonge et al.lljmh . 

Averaged over entire galaxy disks, IKennicutt! (| 19891 . 

I1998T ) found that N =1.4 ± 0.15 describes the KS relation- 
ship well for over five orders of mag n itude in total gas s urface 
den sity (see a l so iBuat et allll98a ). IKennicutt et all (120071 ) 
and lLiu et all (|201ll ) infer a similar power-law relationship 
on ~500-2000 pc scales in the spiral galaxy M51 (NGC 
5194). An interpretation for N=1.5 is that the dominant 
timescale is determined by gl obal gravita t ional instability 
l|Quirkll 19721 ; lKennicutt|[l989h . iBigiel et all (|2008T ) , hereafter 
B08, analyzed resolved observations of a sample of galaxies, 
and inferred an approximately linear molecular KS relation- 
ship at intermediate densities 10 Mq pc -2 < E mo i < 100 
Mq pc -2 and speculated about a super-linear relationship 
at higher densities. As discussed by B08 a linear relation- 
ship may be evidence of a constant molecular gas depletion 
time, and that extragalactic CO observations, often averag- 
ing emission over many square kpc, are simply "counting" 
molecular clou ds with relat i vely s imilar properties. For M51, 
both B08 and lBlanc et all (|2009h find a sub-linear KS rela- 
tionship. There are numerous differences in the observations 
and analysis, including reduction techniques, that may ac- 
count for the discrepancies between the estimated power-law 
indices. Thus, a comparison between the various observa- 
tional investigations is often not straightforward. We also 
caution that any analysis and conclusions, including those 
from our work here, are affected by the specific assumptions 
that go into data reduction or conversions from observables 
to physical quantities. However, with these caveats in mind, 
we will use the B08 data sets to illustrate the application of 
a hierarchical Bayesian methodQ 



1.2 Linear regression in astrophysics 

A central component of many observational investigations 
is the quantification of the correlation between two or more 
observed quantities. Typically, linear regression provides es- 
timates of the zero-point and slope of the "best-fit" regres- 
sion line between the observed data. In log-space, linear re- 
gression of the logarithm of the observed quantities provide 
estimates of the coefficient C and index N of a power-law 
y = Cx N . For example, linear regression is often employed 
to quantify the relationship between the linewidth and sizes 
of molecular clouds, the luminosity and kinematic velocity of 
galaxies (Tully-Fisher relationship), the X-ray spectral slope 
and Eddington ratios of quasars, the star formation rate and 
gas surface density in the ISM, and a host of other topics in 



1 We note that the calibrations for some of these data sets may 
have been revised in the meantime (e.g. Leroy et al 2012, submit- 
ted), though without changing the overall conclusions regarding 
the KS relationship. In a future paper we plan to expand this anal- 
ysis to include these most recent calibrations and larger galaxy 
samples. 



current astrophysical research. It is thus important to un- 
derstand the limitations of common fitting methods, and, 
whenever possible, develop accurate regression algorithms 
appropriate for the given problem. 

One common statistical method for fitting data is the 
ordinary least squares (OLS), or \ 2 fit- An OLS fit com- 
putes the best fi t line by minimizi ng the squared error in 
the residuals. As llsobe et all (|l990h show, the OLS fit can 
produce discrepant slope and intercept estimates depend- 
ing on the classification of the "dependent" and "indepen- 
dent" variables. An important limitation of the OLS method 
is that it does not account for measurement uncertainties 
in the regression, resulting in biased parameter estimate s 
dAkritas fc Bershadvlll996l ; IWeiner et al. 2006; K elly||2007n . 

Further, observational datasets are often intrinsically 
hierarchical, or structured, but most common statistical 
methods do not account for any such hierarchy. For example, 
consider Esfr and E mo i in a sample of observed galaxies. A 
given galaxy within the survey sample will contain a number 
of measurements of Esfr and E mo i. If a linear regression is 
employed on all measurements of Esfr and E mo i from all 
galaxies, then any galaxy-to-galaxy variation could be lost 
in the global EsFR-E mo i relationship. Moreover, the global 
or universal relationship estimated from a linear regression 
analysis on all data may be biased towards the trend from 
one or more galaxies, e.g. those with the largest number of 
Esfr— E mo i pairs, or, if noise is considered, to those galaxies 
with the highest signal-to-noise ratios and with the tightest 
Esfr— E mo i correlations. Here, we develop a fitting method 
for such hierarchical data, such that we estimate regression 
parameters for both the population, or group, as well as the 
individual relationship, allowing for an assessment of the 
differences between individuals within the group. 

In this work, we present a method which rigorously 
treats measurement uncertainties, as well as estimates the 
scatter about the fit regression lines, two aspects which can 
naturally be handled in a hierarchical framework. Bayesian 
methods are well suite d for treating prob l ems with hier- 
archi c al structure (e.g . iGelman et all 12004: Gcl man fc Hilll 
120071 ; H ruse hkel 1201 il l Moreover, measurement uncertain- 
ties can be rigorously treated at each level in the hierar- 
chy, such that the estimated parameters fully account for 
any source of uncertainty in the modelling. Bayesia n meth- 
ods a re becoming more common in astrophysics (|Loredd 
2012), and have been employed for a number of prob- 
lems, such as th e correlation between m ass and richness of 
gal axy clusters (lAndreon fc Hurnl |2010| ). bin ary eccentric- 
ity |Hogg et al.ll2010l ), supern ova light curves | j VIandel et all 
l201lf ), mo delling dust SEDs dKellv et alll2012l) and extinct 
tion laws llFoster et al.l l 2012T ). and turbulence in the ISM 
|Shettv et al.ll2012f ). to name a few applications. Here, we 
introduce a general hierarchical Bayesian method for linear 
regression, and to illustrate its applicability we estimate the 
parameters of the KS relationship in local galaxies. 

In the next Section, we provide an overview of hier- 
archical modelling, and describe the hierarchical Bayesian 
framework. We demonstrate the accuracy of the method on 
synthetic datasets in Section [3] After a brief discussion on 
the observational datasets in Section 3] we present results 
from the application of the method on the B08 observations 
in Section [5] In Section [6] after listing some caveats and fu- 
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ture prospects, we offer an interpretation of our results and 
conclude with a summary. 



2 MODELLING METHOD 

In this section, we describe the modelling method we em- 
ploy to estimate the parameters of the KS relationship. As 
Bayesian inference is still not extensively employed in as- 
trophysics research, we describe why it may be preferred 
over traditional methods. We begin by motivating hierar- 
chical modelling, in the context of the KS law. We then pro- 
vide a general description of Bayesian inference, followed by 
the modelling method and an outline of the fitting routine. 
A general introduc tion to Bayesian analysis can be found 
in iKruschkd l|201ll ). For thorough descriptio ns of hierarchi- 
cal Ba yesia n methods, we r efer t he reader to lGelman et al.l 
(|2004h and lGelman fc Hilll (|2007m . 



strongly dependent on the adopted value of the "Xco fac- 
tor" , the conversion of CO intensity to H2 surface density, for 
which recent efforts have clearly demonstrated varies with 
enviro nm ent (see e.g. Glover fe Mac Low 2011 ; Shetty et all 
2011allbT; iNaravanan et al.l |2012| ; iFeldmann et all l2012al ; 
Sandstrom et al.l 12012! '). "in this work, we focus on the es- 
timated measurement uncertainties (e.g. due to calibration) 
on Esfr and E mo i directly, though we preview future efforts 
where Xco will be treated self-consistently in the modelling 
method. 

Our goals are to estimate the parameters of the KS 
relationship for each individual galaxy as well as the univer- 
sal values, including a rigorous treatment of uncertainties 
and the scatter about the regression line. On the individual 
galaxy level, each galaxy is considered to have its own rela- 
tionship between Esfr and E mo i. The parameters governing 
the Kennicutt-Schmidt relationship are the power-law index 
N and the coefficient, A. After transforming Equation Q] to 
log space, we have: 



2.1 Motivation: hierarchical modelling with 
measurement error 

The parameters of the Kennicutt-Schmidt (KS) relationship 
are governed by Equation [1] relating Esfr and E mo i. One 
of our primary goals is to estimate the distribution of KS 
parameters for a population of galaxies, e.g. the mean and 
dispersion in KS slopes for the galaxy population. Estimat- 
ing the KS relationship requires measuring EsFR-E mo i pairs 
within a galaxy, and for many individual galaxies. As such, 
the process of estimating a universal KS relationship is in- 
trinsically hierarchical^ 

Of course, the star formation properties in each galaxy 
will likely be influenced by physical processes that may or 
may not be direct ly associated with the surface density (e.g. 
iLerov et al.ll2008T l. Local or large scale processes, for exam- 
ple, metallicity, magnetic fields, molecular gas fractions, tur- 
bulent levels, or rotation may all influ ence the st a r form ation 
properties, besides E mo i. Moreover, IShi et all (|201lD find 
that Esfr portrays a stronger correlation with the stellar 
surface density compared to the gas surface density across 
galaxies with different morphologies. Therefore, the Esfr- 
E m oi relationship in different galaxies may not necessarily be 
expected to follow a single trend, and there may be scatter 
about any fit KS law as formulated by Equation [T] Indeed, 
uniform analyses of a sample of galaxies by B08 has resulted 
in a range of indices 0.84 - 1.12. 

Another unavoidable caveat in fitting a model is the ef- 
fect of measurement uncertainties. Measurement uncertain- 
ties can produce significan t biases when fitting a model (e.g . 
lAkritas fc Bershadvl fl99rj ; IWeiner et all 120061 ; iKellvl 120071) . 
These uncertainties will also contribute to the scatter about 
any regression line, which should be quantified during the 
fit. Further, there are additional sources of uncertainty in 
the conversion between observables (e.g., UV, IR or CO in- 
tensity) and Esfr and E mo i needed for evaluating the KS 
relationship. In particular, the measurements of E mo i are 



2 Different terminology, including "multilevel" or "random ef- 
fecta" has been used in place of "hiera rchical" modelling (e.g. 
iGelman et alll2004l ; IGelman fe Hilill2007f) . 



log(E S FR.) — A + N log(E mo i) + e scat (2) 

where A = log(a). The scatter about the regression line is 
tscat , assumed to have mean and dispersion a scat , which is 
one of the parameters to be estimated from the method. 

In the evaluation of Equation the parameters for each 
galaxy should be related to the universal values of those pa- 
rameters. Under a hierarchical framework, both the individ- 
ual and universal, or group, parameters are estimated simul- 
taneously. We carry out the fit under a Bayesian framework, 
which is ideally suited for evaluating all the parameters of a 
hierarchical model. 



2.2 Bayesian Inference 

Bayes' Theorem allows for the evaluation of probability V 
of a set of parameters O given the observed data T>: 

T{0\V) ocV{V\Q)V(Q). (3) 

Here, P(O) is the prior on O, where O is a vector of all 
the parameters defining a model. The parameters making 
up the set O are described below, and includes the slope of 
the KS law N, for each individual as well as the group value. 
The other term on the right hand side, V(T)\Q), is the like- 
lihood, which is the probability of the data given the set Q. 
The outcome of Bayesian inference is the posterior V(Q I'D), 
which is the probability distribution function (PDF) of the 
model parameters O given the data T>. 

The next two subsections describe the measurement and 
the full hierarchical model, defining the likelihood and pri- 
ors. We use standard statistical notation in describing how 
quantities are conditionally related and their distributions. 
Namely, y\x indicates a variable y given a value of x. And, 
y\n,cr 2 ~ Af(fi, o" 2 ) denotes that y is drawn from a normal 
distribution Af, given a mean value /i and variance a 2 . The 
mean value of a vector x is denoted by x. In the model, 
we use Gamma functions T(s,r) for the distributions on the 
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inverse of the variance, with s and r the shape and rate 
parameters^ 

2.3 The measurement model 

Due to observational uncertainties, the measured values 
E m oi and Esfr are related to the true values E mo i and Esfr 
by 

log EsFR = log EsFR + £SFR (4) 

log E mo i = log E mo i + e mo i (5) 

Here, esfr and e mo i are the random measurement errors, 
assumed to have mean and fixed known (or estimated) 
dispersion ctsfr and <T mo i, respectively. 

2.4 The Hierarchical Model 

For each galaxy j (j = 1, 2, J), observations provide mea- 
surements Emoi^j, EsFR.ij, where i (i = 1, 2, Ij) indicates 
individual measurements within a given galaxy J. Along 
with those measurements, we have their estimated uncer- 
tainties parametrised by <r mo i,jj and ctsfr, ij- As the desired 
KS parameters relate the "true" values of Esfr with E mo i, 
we need to estimate Esfr and E mo i from the measurements 
and uncertainties. The following set of conditional proba- 
bility distributions denotes the relationships between those 
parameters on the individual galaxy level in the hierarchy, 
as well as the relationship between the KS parameters and 
the true values of E mo i and Esfr: 

logEmol^jlEmol^j ~ A/"(logE Ino i l ij J cr mol>1 j) (6) 

logEmol^jlEmolJjO^molj ~ A/"(log E mo lJ , <T 2 mo l,j) (7) 
logE S FR,ij|E S FR,ij ~ A/"(logE S FR,ij,0-SFR >i j) (8) 

logE S FR,ij|A;, iVj, E mo i,ij, a s 2 catJ ~ (9) 
N{A 3 + Ni log 

We have constructed a model using normal distributions for 
all the relevant parameters in the individual galaxy level. 
Note that Equation [5] contains the KS relationship from 
Equation [2] The relationships above require quantities that 
must be evaluated in the group level of the hierarchy, i.e., 
those related to the distribution of KS parameters for the 
galaxy population, such as Aj, Nj, cr S cat,j- 



For the group model, we have: 

Aj | fix, VA ~ A/"(/LtA, Va) (10) 

Nj I fi N , v N ~ Af(jJ,N, v N ) (11) 

logE 

mol,j I /^mol •> "fmol ~ A/"(Mhio1, Wmol) (12) 

l/o- 2 mo ij I s mo i, r mo i ~ r(s mo i, r mo i) (13) 

1/^scatJ I ^scat , '"scat ^ F(s sca t , 7* sc at ) (14) 



Equations [6l — [Til describe the quantities for the individuals 
and the population. Those quantities with subscript j re- 
fer to individual galaxy properties, for instance E mo y is the 
mean gas surface density of galaxy j, and fj, mo i is the mean 



3 The inverse of the variance is referred to as the precision, for 
which r distributions are commonly empl oyed for their distribu- 
tions l lKruschkell201ll : lGeIman et al.ll2004) . 



value of the surface density for all galaxies. These equations 
describe how the distributions of individual parameters are 
derived from the group parameters. For example, the indi- 
vidual slopes Nj are conditional on the group slopes and 
variances /xjv and vn, respectively. Similarly, the distribu- 
tions of the group values require assumed distributions on 
another level in the hierarchy. 

The final hierarchical level completes the model setup. 
The assumed distributions of quantities in this level are the 
"hyperpriors" which govern the group distributions, which 
in turn lead to the individual parameters. We construct 
broad hyperpriors, as we would like to rely primarily on 
the data to estimate the group parameters: 



At A ~ JV(Q, 100) (15) 

1/va ~ r(0.1, 0.1) (16) 

/j, N ~Af(0, 100) (17) 

i/«n ~r(o.i,o.i) (is) 

Mmol ~ Af(0, 100) (19) 

iAw~r(o.i,o.i) (20) 

^mol | ^mol ; *^mol " ^mol / ^mol (^^-) 

T*mol | ^mol 7 rfmol — Tlmol/ ^mol (22) 

m mol ~r(l,0.1) (23) 

d m oi~r(l,0.1) (24) 

Sscat | ^scat, d sc &t — ^scat/ ^scat (^5) 

^"scat | ^-scat j dsc&t — T^-scat Meat (26) 

msoat ~r(l,0.1) (27) 

rfscat ~r(l,0.1) (28) 



The hierarchical model described in Equations [6] - [28] 
contains all the relevant conditional dependencies and dis- 
tributions to evaluate Equation [3] The hyperpriors of the 
group KS parameters (Eqns. [15] and 1 1 Tj) are normally dis- 
tributed, with very large variances. These wide distributions 
allow the data to govern the final estimates of the PDFs of 
Ha and /ijv. For the variances of the individual parameters, 
va and vn, as well as for the scatter term, their inverses 
are modeled as Gamma functions, for which the estimated 
values are again primarily governed by the data. 

The main parameters we are interested in are those 
that define the KS relationship, at both the individual and 
group levels, and the scatter term about the regression line 
in Equation [2] O' = (Aj, Nj, fiA, fJ-N, Cs C at,j)- We simply 
marginalize over the other parameters required for estimat- 
ing the regression parameters. 

Modern Markov Chain Monte Carlo (MCMC) tech- 
niques allow for efficient sampling of the full parameter 
space. We employ the "Gibbs sampling" method for gen- 
erating random draws from the posterior. The posterior is 
constructed by multiplying the conditional relationships de- 
fined by Equations [6l- [28l At each step in the MCMC chain, 
Gibbs sampling generates a random draw from the posterior 
by cycling through the conditional distributions of each pa- 
rameter, such that a new value of each parameter is drawn 
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from its distribution conditional on the curre nt values of all 
the o ther parameters and the data (see, e.g.. iGelman et all 
120041). We use the JA GS software (Just Another Gibbs Sam- 
pler, |Plumme3[2003 j3 within the R programming language^ 
to carry out this analysis. In our execution, we run three 
seperate MCMC chains, with each chain containing 25,000 
steps. With this choice of a large number of sampling steps, 
the convergence of parameter estimates is easily achieved. 
Before proceeding to evaluate this model on the observed 
data, we use synthetic data with known parameters to ver- 
ify the accuracy of this hierarchical framework. 



3 METHOD TESTING 

To test the accuracy of the method, we apply it to syn- 
thetic datasets, and compare the parameter estimates to 
the adopted values. As we describe below, many aspects 
of the model assumptions are not strictly satisfied. These 
discrepancies allow us to test the sensitivity of the model 
assumptions on the derived parameter estimates. 

For our initial tests, the data are constructed to re- 
semble the observed sample we analyze in Section 5. We 
construct two groups, each containing 7 individual galaxies, 
which is the number of galaxies in the B08 sample we ana- 
lyze in the next Section. For each galaxy, we choose values 
for the slope, intercept, and scatter and evaluate Equation 
[5] The chosen slopes and intercepts are results from fitting 
the B08 sample: for Group A, the Bayesian inferred values 
from this work (Section[5]), and for Group B, the bisector fits 
(from B08). We generate 50 log(E mo i) values for each galaxy, 
which are uniformly distributed between 0.1 and 2.2, a range 
comparable to log(E mo i) from B08. We add a value drawn 
from a Gaussian distribution with mean and 0.1 standard 
deviation for the intrinsic scatter term in Equation [2] For 
the 50 simulated values of E mo i and Espr of each individual 
galaxy, we construct E mo i and Esfr by adding noise drawn 
from normal distributions with zero mean and standard de- 
viations corresponding to 25% and 50% of S mo i and Esfr. 
These nois e levels are the estim ates provided in lLerov et al.l 
(|2009h and lBigiel et al.l (|2010h . We then fit the hierarchical 
model over the full range of Esfr and E mo i. The first four 
columns of Tables [T] and [5] show the intercepts, slopes, and 
scatter terms of the test galaxies in groups A and B, respec- 
tively. The group parameters in the last row are simply the 
mean values of the intercept and slopes of the individual 
galaxies. 



3.1 Bayesian Parameter Estimates 

The results of the Bayesian fit on the datasets are shown in 
Columns 5 — 5 in Tables [1] and [2] The first four columns 
correspond to the individual test galaxies, indicating the 
adopted values of the intercepts, slopes, and scatter terms, 
respectively. The fifth and sixth column show the median 
and 2a (95%) range in the estimated intercepts of each indi- 



4 JAGS is freely available from http://mcmc-jags.sourceforge.net 

5 R is freely available from http://cran.r-project.org 



vidual test galaxy0 Similarly, the seventh and eighth column 
shows the estimated slopes, and its 2a range. The last col- 
umn shows the posterior median of the scatter term a sca t in 
Equation [2] . The last row of the tables shows the adopted 
and estimated quantities of the group distribution. 

The median of the posterior, which provides estimates 
of the most probable parameters, is similar to the true val- 
ues. Within 95%, the posterior distributions of the slope 
and intercept all contain the true value. The median of the 
scatter term is also an accurate estimate of a scat Q This in- 
dicates that the hierarchical fitting has accurately recovered 
the true parameters. It is also evident that the PDFs of the 
estimated group parameters include the true values, demon- 
strating that the method accurately recovers both the group 
and individual parameters, to a high degree of accuracy. 

Figures [T] and [5] show the median (gray circle) and la 
distribution (gray contour) of the estimated slope and inter- 
cepts, along with the true value (black crosses). In all but one 
case, the la contours for the individual parameters contains 
the true value. For individual Galaxies A4 and B4, the true 
value lies just outside the la contour, but is within the 2a 
interval, as indicated in Tables [T] and In the hierarchical 
model, the Bayesian estimates of the individuals are affected 
by the global group parameters. This effect, called "shrink- 
age", drives the overestimate of the slopes for galaxies A4 
and B4 towards the group inferred value, but nevertheless 
the true value is contained within the 2a confidence interval. 



3.2 Comparison with ordinary least squares 
methods 

In order to assess how the hierarchical Bayesian mod- 
elling compares to other fitting methods, we perform three 
OLS fits: the y\x, x\y, hereafter OLS(EsFR|E mo i), and 
QLS(S mo i|SsFR) , respectively, and an OLS bisector (e.g. 
Ilsobe et al.lll990h . The estimated slope of the OLS fits is 
dependent on the covariance of the data, and variance of 
the quantity labeled as the predictoijf] or what is considered 
the independent quantity, as further discussed below. In or- 
der not to place a preference on Esfr or E mo i, the bisector 
fit has emerged as the preferred method in fitting KS rela- 
tionships. 

The estimates from these three OLS fitting methods, 
however, should not be interpreted as the s ame quantities, 
as di scussed by Isobe et al. (1990, see also IGelman fc Hilll 
120071 ). An OLS(y |a;) estimates the conditional mean of "y 
given x". Accordingly, as the constructed relationships, or 
simulations, describe how the mean value of log(EsFR) de- 
pends on log(E m oi), the OLS(EsFR|E mo i) is the most appli- 
cable. The OLS(E mo i|EsFR) estimates how the mean value 
of log(E mo i) depends on log(EsFR), and the bisector esti- 
mates are weighted means between the OLS (E mo i| Esfr) and 
OLS(EsFR|E mo i) - the details of the slope estimates of the 



6 It is common practice to conservatively consider the 95% in- 
terval as the range in plausible p arameter estimates (see, e.g. 
IGelman et alll2004 lKruschkdl201ll) . 

7 The 95% confidence interval for cr sca t is not shown, but is ~[0.09 
— 0.45], and always contains the true value (0.1). 

8 The predictor, or covariate, is usually the quantity plotted on 
the "x"-axis. 
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Table 1. Adopted and Baycsian inferred parameters for Test Group A 



Subject 


True A 


True N 


True (Tscat 


Bayes A 


Bayes 2a a 


Bayes N 


Bayes 2ctjv 


Bayes t7 SC at 


Test Galaxy Al 


-2.77 


0.72 


0.1 


-2.79 


[-2.9, -2.7] 


0.74 


[0.64, 0.84] 


0.12 


Test Galaxy A2 


-3.21 


0.88 


0.1 


-3.23 


[-3.4, -3.1] 


0.86 


[0.76, 0.97] 


0.12 


Test Galaxy A3 


-3.18 


0.89 


0.1 


-3.14 


[-3.3, -3.0] 


0.89 


[0.79, 0.99] 


0.12 


Test Galaxy A4 


-2.81 


0.78 


0.1 


-2.91 


[-3.0, -2.8] 


0.82 


[0.72, 0.92] 


0.12 


Test Galaxy A5 


-2.87 


0.74 


0.1 


-2.91 


[-3.0, -2.8] 


0.76 


[0.66, 0.86] 


0.12 


Test Galaxy A6 


-3.22 


0.91 


0.1 


-3.13 


[-3.3, -3.0] 


0.87 


[0.78, 0.98] 


0.12 


Test Galaxy A7 


-2.82 


0.92 


0.1 


-2.78 


[-2.9, -2.7] 


0.90 


[0.80, 1.00] 


0.12 


Group A Parameters 1 


2.98 


0.83 


0.1 


2.99 


-3.2, -2.7] 


0.84 


[0.65, 1.0] 


0.12 


1 2a range in estimated dispersions of the group intercept and slope are y/vji 


=[0.17, 0.57], and Jm = 


[0.13, 




0.42]. 


















Table 2. Adopted and Bayesian inferred paramet 


ers for Test Group B 










Subject 


True A 


True N 


True o-scat 


Bayes A 


Bayes 2a a 


Bayes TV 


Bayes 2a j\r 


Bayes o- sca t 


Test Galaxy Bl 


-2.29 


0.84 


0.1 


-2.30 


[-2.4, -2.2] 


0.85 


[0.75, 0.95] 


0.12 


Test Galaxy B2 


-2.53 


0.92 


0.1 


-2.56 


[-2.7, -2.4] 


0.91 


[0.81, 1.01] 


0.12 


Test Galaxy B3 


-2.15 


0.96 


0.1 


-2.48 


[-2.6, -2.3] 


0.96 


[0.86, 1.06] 


0.12 


Test Galaxy B4 


-2.26 


0.92 


0.1 


-2.35 


[-2.5, -2.2] 


0.95 


[0.86, 1.05] 


0.12 


Test Galaxy B5 


-2.33 


1.00 


0.1 


-2.36 


[-2.5, -2.2] 


1.01 


[0.90, 1.11] 


0.12 


Test Galaxy B6 


-2.54 


1.12 


0.1 


-2.45 


[-2.6, -2.3] 


1.08 


[0.97, 1.18] 


0.12 


Test Galaxy B7 


-2.12 


0.95 


0.1 


-2.09 


[-2.2, -2.0] 


0.94 


[0.84, 1.06] 


0.12 


Group B Parameters 1 


-2.37 


0.96 


0.1 


2.37 


-2.6, -2.1] 


0.96 


[0.77, 1.14] 


0.12 



1 2a range in estimated dispersions of the group intercept and slope are y/vX =[0.15, 0.53], and y/vpj =[0.13, 
0.43]. 

different OLS fits are provided below. The three OLS re- 
sults will provide different estimates, as they are different 
statistics derived from the joint distribution of the measure- 
ments. The choice of which quantity should be identified as 
the "x" or "y" variable should be motivated by the scien- 
tific goals. For the KS law, we are primarily interested in 
how Esfr, varies with £ mo i. As a result, we would expect 
the OLS(EsFR|S mo i) to provide more accurate estimates 
of the underlying parameters of the simulation for a single 
galaxy, in the absence of noise. Yet, the simulation includes 
noise and intrinsic scatter, important properties that a sim- 
ple OLS(£sFR.|£ mo i) does not account for. As the bisector fit 
has been employed to estimate the KS parameters in previ- 
ous works, we also compare those estimates to the Bayesian 
result. Following conventional practice, for estimating the 
group parameters we simply apply the fit to all data from 
the seven galaxies together. 

Tables 3 and 4 provide the results of the fits. We show 
the 2a uncertainties along with the OLS best fit parame- 
tersQ These best fit values, along with the la uncertainties 



9 For the bisector, the formal uncertainties are not given, as 
they are similar in magnitude to the OLS(EgFR|E mo i) and/or 
OLS(S mo i|Egpa) fit. A more conservative uncertainty estimate 



for the OLS(EsFR,|£ mo i) and OLS(E mo i|EsFR) estimates, are 
also shown in Figures [1] and [2] 

Tables 3 and 4, as well as Figures [1] and [2] indicate 
that the bisector consistently overestimates the slopes (and 
underestimates the intercepts), for all individuals within 
the groups, as well as the group estimate. The results of 
the OLS(EsFFt|E mo i) and OLS(E mo i|EsFR,) are rather dis- 
crepant, leading to a bisector parameter estimate falling be- 
tween those results. 

That the estimates from the OLS fitting vary between 
the three different methods, as well in comparison to the 
Bayesian estimates, can be understood by the definition of 
the slopes and intercepts in least-squares fitting. The OLS 
para meters have been extensively discussed and presented 
(e.g. Ilsobe et al.lll990l ; iKellvl 120071 ). so we simply state the 
slope definitions here. For an OLS(EsFa|E mo i) fit, the slope 
-^s S pr|s i is computed by taking the ratio of the covari- 
ance (Cov) between the data and the variance (Var) in the 
predictor E mo i: 

AT Cov(S mo i, Esfr.) (0Q . 

TV£ SFR |E moI - — — ^ (29) 

Var(E mo i) 



employed by B08 is the range in the parameters estimated by 
OLS(E SF R|S mol ) and OLS(S mol |S SPR ). 
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Figure 1. Slope and intercept of test galaxies in Group A. Black cross shows the true values. Red and orange squares show the 
OLS(EsFR,|£ mo i) and OLS(E mo i|Sspa) results, with their lcr uncertainties, respectively. The gray circles indicate the estimate provided 
by the median of hierarchical Bayesian posterior result, and the contours mark the lcr deviation. The filled blue squares mark the bisector 
estimates. The last panel on the bottom row shows the group parameters and fit estimates. 



For the OLS(E mo i|EsFR), the fit slope is the inverse of the 
desired quantity in Equation [2] so that: 



Ny. 



Var(E s 



Cov(E mo i, Esfr) 



(30) 



The bisector slope Nb[ s is a weighted mean of the 
OLS(E S FR|E mo i) and OLS(E mo i|E S FR) slopes. 



^B i8 = (iv Emol | EspR + Ar EspR|Emol ) 1 



NESFRlSmol^SmollSSFR _ 1 + 



(31) 



./(i+m Z xTTiv? Z ) 



Equations [29] -[31] illustrate what we stated earlier: that 
the three different slope estimates are just three different 
statistics (or summaries) derived from the same joint dis- 
tribution. Choosing one estimate over the other does not 
imply that one quantity "causes" the other, as is sometimes 



claimed to be implied by the terminology of "independent" 
and "dependent" variables. However, the different slope esti- 
mates do differ in interpretation. The OLS(EsFR|S mo i) slope 
describes how the mean value of Esfr varies with E mo i while 
the OLS(E mo i|EsFR) slope describes how the mean value of 
E mo i changes with Esfr- Thus, both OLS slopes are easily 
interpretable. In contrast, the bisector slope is a weighted 
average of the two OLS slope, and it is not clear how this 
should be interpreted. 

The OLS slopes are therefore strongly dependent on 
the statistical properties of Esfr and E mo i- For the syn- 
thetic data of both groups, Var(E mo i) = 0.39, pooling all 
data from each galaxy together. For Group A, Var(EsFR) = 
0.33, and Cov(E mg i, Esfr) = 0.32. In Group B, Var(E SF R) 
= 0.41, and Cov(E mo i, Esfr) = 0.36. The covariances and 
Var(EsFR) of the two groups are similar, as the adopted 
slopes only differ by ~ 10%. More importantly, the covari- 
ance is < 1. As the covariance occurs in the denominator 
of the OLS(E mo i|EsFR) slope, but in the numerator of the 
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Figure 2. Slope and intercept, along with parameter estimates, of test galaxies in Group B. Symbols and contours are denned in the 
same fashion as in Figure [TJ 



OLS(EsFR|S mo i) slope (and the variances of the chosen pre- 
dictors are similar), the resulting iV Emol | EsFR > iV£ SFR |£ mol . 

Notice that the OLS(EsFR.|£ mo i) parameter estimates 
are closer to the true value than the OLS(E mo i|EsFR.) 
estimates. This is to be expected, because as described 
above the OLS(EsFR,|E mo i) estimates how the mean value of 
log(EsFR) depends on log(E mo i), and the simulated datasets 
are constructed with a linear relation between those quan- 
tities. For these data, the Cov(E mo i, Esfr) is less than the 
variance of either quantity, thereby leading to iV SsFR j Smol < 
1 and iVs mo i|£ SPR > 1. The OLS(E mo i|E S FR.) result there- 
fore drives the bisector slope towards larger values, leading 
to the systematic overestimates shown in Tables 3-4 and 
Figures (TJ2] The overestimate in the bisector slope is not 
as drastic for Test Group B, because the underlying slopes 
are closer to unity to be g in wi th. Further, note that tests 
performed bv llsobe et al.l (|l990l ) (see their Table 2) produce 
bisector slopes of unity for a number of scenarios, where the 
y\x ot x\y slopes are far from unity. In fact, the bisector fit is 
expected to produce a slope of one when x and y are statisti- 
cally independent, indicative of the difficulty in interpreting 



the bisector. If the scatter about a linear relationship were 
very low, for instance due to small measurement uncertain- 
ties, then the OLS(EsFR|S mo i) and OLS(E mo i|EsFR), and 
correspondingly the bisector, would accurately recover the 
regression parameters. Taken together, one can assess the 
accuracy of the bisector estimated regression parameters by 
inspecting both the OLS(j/|x) and OLS(a%) results. If they 
are highly discrepant, then the bisector result, which will 
fall between the OYS{y\x) and OLS(x|?/) estimates, should 
not be considered to provide accurate parameter estimates 
of a linear relationship. 

3.3 Effect of Model Assumptions 

For the two synthetic datasets considered so far, the nor- 
mal distributions in the intrinsic scatter and measurement 
uncertainties matches the prior distributions in the hierar- 
chical model. In order to test the sensitivity of these as- 
sumptions, we consider an another synthetic dataset with 
uniform distributions for the uncertainty and scatter. Ad- 
ditionally, compared with the previous tests the synthetic 
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dataset we consider here has more individual galaxies, with 
different numbers of (E mo i, Esfr) pairs for each individual. 

The intrinsic slopes and intercepts of each synthetic 
galaxy in Test Group C varies between 0.7 to 1.5 and —3.0 to 
—2.0, respectively. The population mean value of the slope is 
1.1, and the mean intercept is —2.5. The intrisic scatter term 
is uniformly distributed centered on 0, with extent 0.3. To 
construct the noisy measurements E mo i and Esfr, we add 
a random value drawn from a uniform distribution centered 
about 0, with extents 0.2 and 0.25, respectively. We carry 
out the Bayesian fit as before, where we assume Gaussian 
distributions, with la noise estimates equal to the width of 
the uniform distributions employed to construct the noisy 
datasets. We also compare the Bayesian results with the di- 
rect OLS fits. 

Table 5 shows the intrinsic and Bayesian estimated pa- 
rameters. For the slope and intercept parameter estimates of 
the individuals, the true values are always contained within 
the 2(j interval, except for the slope of Test Galaxy CI and 
intercept of Test Galaxy C2, which have the fewest number 
of datapoints. Further, for these individuals the intrinsic pa- 
rameters are far from the mean value, so the Bayesian esti- 
mates are affected by shrinkage, which was described above. 
Notice that the group parameter estimates also recover the 
intrinsic values, and that the posterior median is close to the 
true values. 

The OLS fit results for the population in Table 6 show 
a marked difference compared to the Bayesian estimates. 
Even at the 2a level, neither the OLS(EsFR|E mo i) and 
OLS(E mo i|EsFFt) can recover the true value. Consequently, 
the bisector estimates are also discrepant from the intrin- 
sic parameters. For the OLS group estimates, all datapoints 
are fit simultaneously, so those individuals with the largest 
number of datapoints dominate the final fit. Therefore, the 
OLS parameter estimates tend towards smaller slopes and 
larger intercepts, thereby underestimating the group slope 
and overestimating the intercept, even when considering 
the full 2a interval. This discrepancy, especially for the 
OLS(EsFR,|E mo i) case, is larger than that from Test Groups 
A and B, due to the variable number of datapoints between 
individuals, as well as the higher noise and scatter levels. 

This test has shown that the Bayesian posterior can 
reasonably recover the slopes and intercepts of the individ- 
ual even though the assumed distributions for the noise and 
scatter terms are incorrect. The 2a ranges of the posterior 
only provide erroneous subject estimates when the individ- 
ual has very few datapoints, and/or the individual parame- 
ters are far from the population mean values. Nevertheless, 
the population values are recovered, indicating a marked im- 
provement of the Bayesian result compared to the OLS fits. 

Notice that the Bayesian estimate of the scatter is sys- 
tematically lower than the intrinsic value. This occurs be- 
cause the chosen la value for the noise is set to the extent of 
the true uniform noise distribution. Accordingly, in the syn- 
thetic data there are no measured datapoints occuring at 
greater than la from the true values, whereas the Bayesian 
model assumes that such noisy measurment do exist. Due to 
this overestimate of the noise properties, the Bayesian model 
erroneously designates some of the intrinsic scatter to be as- 
sociated with noise uncertainty, and thereby underestimates 
the intrinsic scatter. 

Clearly, the accuracy of the Bayesian fits are dependent 



on the assumed distributions. If there is good correspon- 
dence between the assumed and intrinsic distributions, as 
in Test Groups A and B, the Bayesian results will be highly 
accurate. If there is large discrepancy in the assumed and in- 
trinsic distribution, then the accuracy Bayesian result will be 
degraded, as for Test Group C. Nevertheless, we have shown 
that for uniformly distributed noise and scatter properties, 
the hierarchical Bayesian method employing normal priors 
for these properties can nevertheless accurately recover the 
population parameters, especially when compared with the 
OLS methods. The magnitude of the discrepancy is of course 
dependent on the noise and scatter properties. Further tests 
may be needed for ascertaining the reliability of the model 
on datasets with more discrepant distributions. When inter- 
preting the results from the application of the method to 
observed datasets, one must of course bear in mind that the 
accuracy is dependent on the priors. As we have considered 
two tests in Section ^. 2l that we consider to be representative 
of real observations, we have confidence that the hierarchical 
model is appropriate for the observed datasets we analyze 
in Section [S] 

3.4 Summary of method testing 

We have tested the hierarchical Bayesian model described 
in Section [2] using three sets of synthetic data. The datasets 
consists of groups containing numerous individual galaxies. 
Each galaxy is characterized by some chosen power-law rela- 
tionship between Esfr and E mo i, as well as intrinsic scatter 
about this relationship. We include noise, similar to esti- 
mated uncertainties from the observations, to produce Esfr 
and Emoi- The hierarchical Bayesian method can accurately 
recover the underlying parameter estimates, as the peaks of 
the posterior are very close to the true values for each indi- 
vidual as well as the mean population parameters, and the 
whole 2a range includes the correct solution. 

The synthetic datasets do not strictly satisfy all aspects 
of the priors in the hierarchical model. Namely, for Test 
Groups A and B, the slope and intercepts of the seven in- 
dividual galaxies are not drawn from a normal distribution 
for the group. Similarly, the gas surface densities of each 
galaxy are not normally distributed, but rather uniformly 
distributed. That the method is able to accurately recover 
the underlying slopes and intercepts indicates the insensi- 
tivity of the parameter estimates to these priors. With Test 
Group C, we examine the situation where the intrinsic and 
noise distributions are uniformly distributed, which is dis- 
crepant from the assumption of Gaussian distributions in 
the hierarchical model. This group has twenty individuals, 
each containing a different number of datapoints. For this 
test, the Bayesian fit again accurately recovers the popu- 
lation slopes and intercepts. The 2a range of the posterior 
only misses the slope and intercept for the individuals that 
have the fewest datapoints and the most discrepant slope 
and intercept relative to the population values. 

The Bayesian fit proves to be more accurate than the 
OLS fits, especially compared to the often employed bisec- 
tor fit. As discussed, however, the bisector is expected to 
estimate a slope which falls between OLS (Esfr |E mo i) and 
OLS(E mo i|EsFR) slopes, and is therefore difficult to inter- 
pret. The bisector should not be considered to provide an 
estimate of N in Equation [2] The hierarchical Bayesian re- 
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suit does provide an estimate of this quantity, and its results 
are easily interpretable: we can estimate the mean value of 
Esfr given E mo i, for each individual galaxy as well as for 
the population. 



4 OBSERVATIONS 

We are now in a position to apply the Bayesian method on 
observational datasets. For its first application, the dataset 
consists of a sample of seven spiral g alaxie s from B08, and 
is publicly available in iBigiel et al.l 1)20101 ). For six of the 
galaxies, E mo i is measured in 750 pc-sized regions from the 
"CO J = 2 - 1 observations in the HERACLES survey 
l|Lerov et al.l I2009T ). For the remaining galaxy, M51, E mo i 
is computed from 12 C O J = 1 — obs ervations from the 
BIMA SONG survey (jHelfer et al J 12003(1 . The CO intensi- 
ties are converted to gas surface densities using a constant 
Ago factor, and a constant 12 CO ( J = 2 - I) /{J = 1 - 0) 
ratio (see B08 and lBigiel et alJboiOl for details) . We explore 
variations in Aco in Section ^. 21 To estimate Esfr, B08 use 
the combination of 24 um intensities from the SINGS sur- 
vey dKennicutt et alj2003h. an d UV fluxes from the GALEX 
surv ey dGil de Paz et alj|2007l ). We refer the reader to B08 
and iLerov et al.l l|2008l . 12003 ) for details of these observa- 
tions. 



5 RESULTS 

5.1 Application to B08 Data 

The results of the hierarchical Bayesian fit on the seven spi- 
ral galaxies from B08 are shown in Table [7] The median 
value of the slopes are all lower than unity, and for four of 
the galaxies, NGC 5194 (M51), NGC 5055, NGC 6946, and 
NGC 628, a linear slope can be excluded at > 95%. The 
median of the inferred group slope is 0.84, with unity falling 
just inside the 2a interval. 

We can verify that the model can reasonably reproduce 
the data by over-plotting representative regression lines con- 
structed from the posterior parameters estimates. From the 
posterior PDF, we can draw a large number of A, N, and 
dscat values, and evaluate Equation [2] at various E mo i (for 
each galaxy). Figure [3] shows the data and 50 random draws 
from the posterior (gray lines) for each galaxy, demonstrat- 
ing that the Bayesian model is consistent with the data. For 
comparison, the red dashed line shows the linear relation- 
ship inferred from a bisector fit on all data. The last panel 
in Figure|3]only shows the data from all the galaxies and the 
associated bisector result. As this ensemble is not directly 
used to infer the group parameters, we do not overlay the 
inferred group lines. 

The Bayesian inferred parameters are rather different 
from the parameters inferred from the bisector fit. The bi- 
sector results are adopted as the true values of our synthetic 
dataset in Group B, and are shown in Table [2] in Section [3j 
Recall that the hierarchical Bayesian fit on that Group ac- 
curately recovered the model parameters, as shown in Fig- 
ure [2] and Table [2] However, the bisector fit produces slope 
estimates which are larger than the model slope, and under- 
estimated the intercept, similar to the results from Group 
A. In Group A, the results of the Bayesian fit from the 



observed data (Table [2| were adopted as the parameters. 
For that test group, the Bayesian result again accurately 
recovered the adopted parameters. Recall that the bisec- 
tor slope is a weighted mean of the OLS(EsFFt|E mo i) and 
OLS(E mo i|EsFa) estimates. For the ensemble B08 data, the 
OLS(EsFFt|E mo i) and OLS(E mo i|EsFit) slopes (and 2a un- 
certainty) are 0.84±0.04 and 1.19±0.04, respectively. As dis- 
cussed in Section [3~2l the large discrepancy between the OLS 
slope estimates is a clear indicator that measurement uncer- 
tainties in Esfr and E mo i are affecting the OLS estimates. 

The observed and synthetic data also differ in the de- 
tailed statistics of Esfr and E mo i. For all data pooled to- 
gether, Var(E mo i) = 0.15, Var(EsFR) = 0.15, and Cov(E mo i, 
Esfr) = 0.12. These values are a factor of 2—3 lower than 
the statistics of the synthetic data discussed in Section |3. 21 
This indicates that there are some differences between the 
synthetic dataset we considered, and the observed data. 
Much of the difference can be attributed to E mo i. For the 
synthetic datasets in Section [3] we simply considered a uni- 
form distribution in E mo i, whereas the measured E mo i in 
each galaxy is not necessarily uniformly distributed. Nev- 
ertheless, the tests in Section showed that the Bayesian 
parameter estimates are not sensitive to the intrinsic E mo i 
distribution. Further, note that the o- sca t estimates in Table 
[7]are all ^ 0.1. This suggests that a power law relationship 
can reasonably describe the observed trends. Taken together 
with the results of the tests, any discrepancy in the true and 
prior distributions of E mo i and a" scat likely does not strongly 
effect the parameter estimates. 

However, the assumed values of the conversion factors 
may play a stronger role in the determination of the KS pa- 
rameters. We have considered E mo i and Esfr directly, such 
that there is no uncertainty in the conversion from lumi- 
nosities to those quantities. Here, we have assumed Aco to 
be fixed at 2xl0 20 cm -2 K _1 km -1 s, but in the next sub- 
section we explore variations in Aco under the hierarchical 
Bayesian framework. 

We can also compare the Bayesian result to the 
OLS(EsFR|E mo i) and OLS(E mo i|EsFR) estimates. As de- 
scribed in Section 13.21 the OLS(EsFR|E mo i) should pro- 
duce the most accurate estimates, because by definition it 
estimates the mean value of Esfr given E mo i. To evalu- 
ate the OLS slopes, we only need to consider the covari- 
ances and variances of the data. The ratio of Cov(E mo i, 
Esfr) /Var(E mo i) is 0.8, equivalent to the ratio of the ensem- 
ble in Group A, and by definition to the OLS(EsFR|E mo i) 
fit slope of both the observed sample and the synthetic data 
in Group A. The OLS(EsFR|E mo i) slope is similar to the 
Bayesian result, though the estimated uncertainty is very 
small, 0.04. This uncertainty is solely statistical, does not 
account for measurement errors, and its low value is pri- 
marily driven by the large number of datapoints 

(9810 

Consequently, the OLS methods significantly underestimate 
the errors, and they do not provide the parameter estimates 



The uncertainty in the slope also depends on the variance in 
x, i.e., on the width of the x distribution. If the x distribution 
is broader, then the uncertainty on the slope decreases. Because 
measurement errors broaden the distribution of x, they also arti- 
ficially decrease the uncertainty in the slope estimate when they 
are ignored. 
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Table 7. Bayesian estimated parameters for the seven spiral galaxies in B08 



Subject 


Bayes A 


Bayes 2a a 


Bayes N 


Bayes 2it/v 


Bayes <T S cat 


NGC 5194 (M51) 


-2.84 


[-3.0, 


-2.7] 


0.72 


[0.62, 


0.83] 


0.06 


NGC 5055 


-3.20 


[-3.3, 


-3.1] 


0.87 


[0.79, 


0.95] 


0.04 


NGC 3521 


-3.20 


[-3.4, 


-3.0] 


0.90 


[0.76, 


1.03] 


0.05 


NGC 6946 


-2.81 


[-2.9, 


-2.7] 


0.78 


[0.70, 


0.86] 


0.11 


NGC 628 


-2.89 


[-3.1, 


-2.6] 


0.76 


[0.51, 


0.95] 


0.05 


NGC 3184 


-3.24 


[-3.4, 


-3.1] 


0.92 


[0.79, 


1.10] 


0.05 


NGC 4736 


-2.83 


[-3.2, 


-2.4] 


0.92 


[0.67, 


1.20] 


0.08 


Group Parameters 


3.00 


[-3.3, 


-2.7] 


0.84 


[0.63 


1.0] 


0.14 




1.2 1.6 2.0 2.4 1.0 1.5 2.0 0.6 1.0 1.4 1.0 2.0 




0.7 0.9 1.1 1.3 0.6 1.0 1.4 1.8 1.0 1.4 1.8 1.0 2.0 

log(E mo i) (M pc" 2 ) log(z m0 |) (M pc" 2 ) log(s mo i) (M pc" 2 ) log(Z m0 |) (M pc" 2 ) 



Figure 3. Hierarchical Bayesian fits (gray lines), along with the data (black symbols), from the B08 sample of seven galaxies (see Fig. 4 
in BOS). Gray lines show 50 random draws from the posterior. For reference, (red) dashed lines show the same linear relationship from 
a bisector fit to all data in the last panel. Note that the axes are different in each panel. 



of the individuals galaxies when all data are fit simultane- 
ously. We therefore caution against the use of such statistical 
uncertainties for characterizing the quality of the linear re- 
gression on large datasets. 

To assess the difference between the hierarchical mod- 
eling and the parameter estimates inferred from all data, we 



perform a direct Bayesian fit to the ensemble. For this analy- 
sis, we do not require a group level in the hierarchical model. 
In this case, we are only interesting in estimating N, A, and 
the scatter term cr scat for the ensemble, shown in the last 
panel of Figure We thus simplify the hierarchical model 
described in Section [2,4! by eliminating the group level, and 
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employ uninformed priors on the slope, intercept, and scat- 
ter terms. We treat measurement uncertainties in the same 
manner as described in Section ^. 41 The posterior median of 
the Bayesian regression analysis of the combined data yields 
X=0.90, with 2a range [0.88, 0.95]. This result is substan- 
tially different from the parameter estimates from the full 
hierarchical model, and would lead to different conclusions 
regarding the KS relationship. Namely, though the inferred 
slope is still less than unity, the slope estimates for two of 
the individual galaxies, M51 and NGC 6946, are excluded 
at the 95% level. Related to that discrepancy, the limited 
range in the 2a estimated slope of the ensemble suggests 
that there may be a "universal" KS index. However, the full 
hierarchical Bayesian result provides a larger range for the 
estimated group parameters and unambiguous differences 
between individual galaxy parameters. That indicates that 
there is likely no universal KS relationship which can accu- 
rately describe the Esfr- S mo i relationship for all galaxies, 
if our assumptions about the conversion factors are valid, as 
further discussed in the next section. The variations in KS 
relationships between galaxies is clearly apparent simply by 
inspecting the datapoints in Figure [3] affirming the variable 
star formation behaviour in different galaxies identified by 
the hierarchical Bayesian fitFI 



5.2 Variations in the Xco factor 

In the analysis thus far, we have assumed that the conver- 
sion from observed intensities to the star formation rates 
and molecular gas surface densities are exact. We thus im- 
plicitly assume that Esfr and £ mo i are directly measured. 
But in actuality, the observed quantities are UV, 24 \im, and 
CO intensities, which are converted to star formation rates 
and gas surface densities using constant conversion factors. 
Uncertainties in the conversion factors have thus not been 
considered in the regression analysis. 

A major advantage of the hierarchical framework is that 
uncertainties in any quantity can be easily incorporated - 
for instance by including additional levels for the conver- 
sion factors. By explicitly implementing another level for 
the conversion factors, the hierarchical analysis can proba- 
bilistically ascertain whether certain values of the conversion 
factors drive tighter KS relationships. Here, we demonstrate 
the capability of the method to treat uncertainties in the 
Xco, allowing for independent variations in Xco between 
each datapoint. In this framework, we are simply consid- 
ering statistical variations in Xco- As we describe below, 
observational and theoretical efforts may further constrain 
this conversion factor. 

In using the observed CO intensity Ico rather than 
fimol, we assume a measurement uncertainty similar to that 
described in Section 12.31 Accordingly, we consider a linear 
regression of the form: 



Table 8. Bayesian estimated slopes and scatter for the seven spi- 
ral galaxies in B08, where the Xqq factor is allowed to vary from 
0.5X10 20 to 8xl0 20 cm" 2 K" 1 km" 1 s 



log(EsF R ) = A + N log(Xcoico) + e» 



(32) 



11 If the KS relationship were "universal", then the population 
variances of the intercept and slope, i>a an d v?f respectively, 
would be (very close) to zero. That we find non-zero values 
v\ = 0.09 and djv = 0.05 also imply that there is no univer- 
sal KS relationship under this framework. 



Subject 


Bayes N 


Bayes 2<jjv 


Bayes cr scat 


NGC 5194 (M51) 


0.67 


[0.57, 0.78] 


0.07 


NGC 5055 


0.86 


[0.78, 0.94] 


0.04 


NGC 3521 


0.88 


[0.76, 1.02] 


0.05 


NGC 6946 


0.74 


[0.66, 0.81] 


0.13 


NGC 628 


0.65 


[0.45, 0.85] 


0.05 


NGC 3184 


0.84 


[0.72, 0.97] 


0.05 


NGC 4736 


0.86 


[0.59, 1.10] 


0.09 


Group Parameters 


0.79 


[0.58, 1.0] 


0.14 



In this case, we need another level for the distribution of 
Xco'- 



Xco ~ W(minXco, maxXco), 



(33) 



where U is a uniform distribution between minXco and 
maxXco- We have explored different extents for the ranges 
in Xco, including up to an order of magnitude, and find 
insignificant differences in the results, either on the indi- 
vidual or group level. The 2a ranges increases slightly, but 
the mean/median of the posterior does not change. This in- 
dicates that independent variations in Xco do not have a 
strong influence on the KS parameters for this sample. 

As an example, Table [8] shows the individual and 
group slopes and scatter terms from the sample where 
Xco is allowed to vary from minXco=0.5x 10 20 and 
maxXco=8x 10 20 cm -2 K _1 km -1 s, allowing for over an 
order of magnitude variation in Xco- The median of the 
group slope is 0.79, and the 2a interval is 0.58 to 1.0, which 
is slightly larger than the range shown in Table [7] There are 
hardly any differences between these results and the param- 
eter estimates when Xco is fixedEI One difference is that 
the slopes are systematically lower by ,$ 0.1. Yet, there is 
significant overlap between the results, considering the full 
95% intervals, so there is no strong evidence of a decrease in 
slopes. Further, the a scat values are similar between the two 
fits, again suggestive that varying Xco (by about an order 
of magnitude) does not produce tighter KS relationships. 
Similarly, the hierarchical Bayesian fit does not provide any 
evidence for a preferred value of Xco- 

For this initial investigation into the effect of uncer- 
tainty in the conversion factor, we assume that Xco does 
not correlate with any other property. This preliminary 
analysis has only allowed for a modest variation in Xco, 
with no constraint for each datapoint beyond the range set 
by conditional relationship (|33[) . Recent theoretical efforts 
have indicated that Xco indeed varies with other physi- 
cal properties, such as metallicity ()Glover fc Mac Lowll201ll : 
IShettv et alj l2011a|j. gas temperature, vel ocity dispersion 
l|Naravanan et al. 2011; Shetty et al. 2011b) or with molec- 
ular gas content ( Feldmann et alj 2012bl ; Narayanan et all 
l2012f ). Observational investigations are also finding ev- 
idence for variable Xco in individual galaxies (e.g. 

12 The model with fixed Xqo can be considered a special case of 
the more general model allowing for Xco variations, but with a 



delta function on 2xl0 20 



K~ 



km s as the prior. 
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ISandstrom et al1l2012l ). Such systematic variations would re- 
sult in different KS paramete rs (see, e.g. lOstriker fc Shettvl 
l201ll ; iNaravanan et all l2012h . We will investigate how the 
KS relationship is affected by these systematic variations 
in Xco in future work. To improve the parameter esti- 
mates, it would be helpful to increase the number of sample 
galaxies and to incorporate uncertainties in converting IR 
and UV luminosities to Esfr. We can further explore other 
parametrizations in Xco, e.g. as a function of 7co, and ei- 
ther for individual galaxies or for the population. Such mod- 
elling will allow for a study into the correlation between the 
parameters, e.g. Xco with E mo i, in addition to fully treating 
measurement and conversion factor uncertainties. 



iBlanc et all ((2009) create numerous Esfr — E mo i grids, con- 
sidering different model KS parameters, a scatter term, and 
different realizations of noise. Then, they assess the x 2 be- 
tween all the model grids and that produced from obser- 
vations of M51. The distribution of \ 2 values identifies the 
best-fit parameters and their uncertainties. As this Monte 
Carlo approach considers the scatter term as well as treats 
uncertainties, it should be more accurate than the typical 
OLS bisector. It has also recently been utilized in Leroy et 
al. (2012, Submitted) on the HERACLES sample. It will be 
interesting to ascertain whether such a method can accu- 
rately recover the individual and group parameters from a 
sample of galaxies, and how it compares to the hierarchical 
Bayesian method presented here. 



6 DISCUSSION & SUMMARY 

6.1 Advantages of hierarchical Bayesian modelling 

We have demonstrated that the Bayesian method described 
in Section[2]can accurately recover the regression parameters 
from a sample containing hierarchical structure. Hierarchical 
data refers to any set with numerous measurements from 
individuals within a group (e.g. measurements from a sample 
of galaxies, with a number of measurements per galaxy). The 
method self-consistently treats measurement uncertainties, 
so that the resulting parameter estimates are PDFs which 
account for any uncertainty in the modelling process. 

The Bayesian method is shown to be superior to the 
commonly employed OLS estimates that usually do not con- 
sider the hierarchical structure in the dataset. Besides ac- 
counting for the measurement uncertainties, the hierarchical 
nature of the dataset is preserved, as the Bayesian method 
simultaneously estimates the parameters of both the indi- 
viduals as well as the population. Under this framework, 
differences between individuals are clearly exposed, and the 
range in plausible group parameters self-consistently permits 
for variations between individuals. 

Another key aspect of the hierarchical Bayesian method 
is the robust estimation of the scatter cr sca t about the fit re- 
gression line, for both individuals subjects as well as the 
group. When assessing the KS relationship from observa- 
tions, there is a tren d of decreasing g a c at with increas- 
ing aperture size (e.g. iTh 



lker et all [20071 : iKennicutt et all 



120071 : iRahman et afll201ll ). Theoretical efforts have con- 
sidered cr sca t and its relationship to sampling size. It 
may be due to variable sampling o f the underlying cloud 
mass spectrum ijCalzetti et alJl2012fl. influence on averag- 
ing times cale, metallicitv Jpibl " 20111 ). background radia- 
tion field ( Fcldman net a 1.11201 lli . or simply transient effects 
(e.g. IShettv fc Ostrikerll200Sl ). With credible knowledge of 
the measurement uncertainties (e.g. <r mo i and ctsfr), the 
Bayesian method will provide accurate estimates of the in- 
trinsic scatter in the KS relationship. Future work consider- 
ing the possibility of variable conversion factors for deriving 
Esfr and E mo i will provide even more accurate estimates 
of the intrinsic scatter, as the scatter due to the systematics 
can be robustly separated from the intrinsic scatter. These 
estimates of the intrinsic scatter can thereafter be used for 
quantitative comparisons with theoretical results. 

Recently, new statistical methods have been considered 
beyond the bisector to estimate the KS parameters. Namely, 



6.2 The KS relationship in galaxies 

As discussed in the introduction, the KS law is a widely in- 
vestigated relationship between the star formation rate and 
molecular gas surface density. The initial works using galaxy 
wide averages estimated N~1.5. Follow up work using back- 
ground subtracted em ission from resolved o bservations by 
IKennicutt et al.l |2007h and lLiu et alj (|201lf ) recovered this 
result. On the other hand, kpc-scale observations of a large 
sam ple of galaxies hav e inferred slopes cl oser to unity (e.g. 
B08, iBigiel et al.ll201ll; iLerov et al.ll2008l , Leroy et al. 2012 



Submitted. IRahman et al.ll2012r iT 

After verifying the accuracy of the hierarchical Bayesian 
method, which models how the mean value of log(EsFR) 
depends on log(E mo i), we applied the method to estimate 
the parameters of the KS relationship in the sample of disk 
galaxies from B08. The fitting results from both the test 
samples (Section [3} and the B08 sample suggest that previ- 
ous methods have overestimated the KS slopes. The hierar- 
chical Bayesian method produces slopes which are slightly 
lower than unity, especially when considering the full 95% 
confidence interval for some of the individual galaxies. 

Before interpreting our results from this analysis, we 
note a number of important caveats which should be kept 
in mind. First, as we have discussed throughout this work, 
we have directly considered Esfr and E mo i. Yet, these are 
not directly measured. The star formation rates and surface 
densities are estimated from the observables (e.g., CO, UV 
and IR intensities) using constant conversion factors. In the 
measurement model (Eqns. I4I5|) . we have only accounted 
for random statistical errors associated with observational 
noise. Yet, systematic errors may arise in the conversion be- 
tween intensities to Esfr and £ mo i. Under a hierarchical 
framework, uncertainties in these conversion factors can be 
naturally handled, including correlated errors between the 
various coefficients. In future work, we will self-consistently 
treat the uncertainties in these conversion factors for a larger 
sample of galaxies. This may reveal interesting new aspects 
of the star formation and gas properties in the ISM. 

Additionally, we have not considered the different ap- 
proaches to es timating star formatio n rates on ~kpc scales 
in galaxies. As 



Rah man et al 



Kennicutt et al.l (|2007f ). lLiu et al.l i|201ll '), and 



(2011 



demonstrate, subtracting off a diffuse 
component can have significant impact on the scaling rela- 
tionships: it may increase the KS slope, because the rela- 
tive contributi on of diffuse emiss ion is largest where Esfr is 
small (see also lLerov et al.ll2012h . A similar argument could 
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possibly be made regarding diffuse CO emission, which may 
not be directly related to recent star formation on ~kpc 
scales (|Rahman et al, 120111). B08 tre a t som e of these as- 
pects differently from Kennicutt et alj (|2007T ) and lLiu et al.l 
l|201lf ) and we refer the reader to these studies for further de- 
tails. We note that lLerov et al, | (|2012T l explore some of these 
effects, in particular the contribution from an evolved stel- 
lar population to the IR star formation tracer, and suggest 
that the uncertainty due to diffuse emission in the estimated 
index N is ~ 0.15 x TV. This may push some of the indices 
of individual galaxies, such as NGC 3184 and NGC 4736 
to values greater than unity. But, for the galaxies with the 
lowest indices, such as M51 and NGC 6946, the data would 
still favor a sub-linear KS relationship. 

One of the results from the hierarchical Bayesian fit on 
the B08 sample is that there are significant galaxy-to-galaxy 
variations in the slopes and intercepts, as already noted by 
B08. The comparison of the PDFs of parameter estimates 
between certain galaxies indicates that to a very high de- 
gree of confidence galaxies have different KS parameters. 
For instance, at > 95% M51 and NGC 3184 have different 
intercepts. Similarly, the slopes are also substantially differ- 
ent, as the posterior median value for NGC 3184, 0.92, is 
ruled out for M51, though there is a very low probability 
that both galaxies have slopes between 0.79 and 0.83. 

The significant galaxy-to-galaxy variation leads to a 
group distribution of the slope with a large dispersion, ~ 
0.4. This wide distribution suggests that there is likely no 
single, or "universal" KS law that is applicable over all disk 
galaxies. This result supports the idea that other physical 
properties, such as metallicity, gas fraction, stellar mass, 
turbulence levels, magnetic fields, or galaxy environment, 
among other factors, affect the star forming properties of a 
galaxy besides the gas surface density. R ecent observational 
results support this description (e.g. ILerov et al.l 120091; 
Shi et al. | boill ; ISaintonge et al.l 1201 ll ; ISandstrom et ail 



20121 . Leroy et al. 2012 Submitted). Theoretical efforts are 



also considering the impact of other physical processes 
besides just th e gas s u rface density (e.g. IStinson et al.ll2006l; 
Ostriker et~aH l2010l: iKim et al.l l201ll: lOstriker fc Shettyl 
20111; iKrumholz et all |2012|; Ishettv fe Ostrikerl |2012| ; 
Glover fc Clarkll2012l ; iFederrath fe Klessenll2012T K 



The results from the hierarchical Bayesian fit on the 
seven spiral galaxies has produced slopes which are system- 
atically lower than previous analyses. The posterior median 
values for all seven individual galaxies are all less than unity. 
In four of the seven galaxies, NGC 5194, NGC 5055, NGC 
6946, and NGC 628, the slope estimates are well below unity, 
even at the 2a level. For M51 (NGC 5194), the posterior me- 
dian of the slope is 0.72, with values ^ 0.83 ruled out with 
95% confidenceEl The sub-linear slope in M51 was even es- 
timated by B08 with the bisector. Moreover, using Ha obser- 
vations, and a Mo nte Carlo analysis of the KS parameters, 
iBlanc et all (120091 ) estimated N = 0.82 ± 0.05. Taken to- 
gether, these results strongly point to a sub-linear KS slope 
for M51. 

The Bayesian inferred group slope is 0.84, with 2a range 



[0.63, 1.0]. This range is consistent with the variable esti- 
mated slopes from the individual galaxies. Future work on a 
larger sample is needed to confirm this 2a result. One inter- 
pretation of a sub-linear relationship is that the gas surface 
density estimated from the detected CO emission is not all 
associated with star formation. The plausible interpretation 
is that at higher CO luminosities, the relative fraction of 
diffuse molecular gas that is not associated with star for- 
mation increases. The KS index may provide a quantitative 
measure of this fraction. 

A sub-linear slope also suggests that the efficiency of 
star formation (here defined as the star formation rate nor- 
malized to H2 mass) decreases for increasing gas surface den- 
sity. Equivalently, the computed gas depletion time is not 
constant, and increases with increasing gas surface density. 
Of course, this depletion time refers to all the gas that is 
detected, and if more non-star forming gas is contributing 
to the observed emission, then the depletion time calcula- 
tion may include superfluous gas not associated with star 
formation. This is consistent with the picture that at high 
CO intensities, relatively more diffuse gas is contributing to 
the emission compared to lower intensity regions, where CO 
is only tracing dense gas. Yet, star formation only occurs in 
the dense most (gravitationally-bound) regions of the ISM, 
regardless of whether CO is present in dense or diffuse gas. 

We have also applied a Bayesian regression fit to the 
publicly available star for mation rates and gas su rface den- 
sities in M51 inferred by Kennic utt et al.l (2007). In their 
investigation, iKennicutt et al. ( 2007h measure Esfr by sub- 
tracting a background from the 24 urn measurements, as- 
suming that some fraction of the dust is heated by an older 
stellar population and thus is not related to recent star 
formation. From our analysis, we estimate slopes (at 95% 
confidence) in the range [1.25, 1.59] for the 13" (520 pc) 
data, and [0.80, 1.22] from the 45" (1850 pc) data. These 
slopes are larger than the range obtained from the analysis 
on the B08 sample, as well as the M51 study bv lBlanc et alJ 
(2009). This discrepancy is possibly due to methodological 
differences, e.g. a diffuse, non-star formation related com- 
ponent in the IR emission (see discussion above), a topic 
that is currently ext e nsively under deb ate l|Liu et al.ll201ll : 
iRahman et~ai1 |2012| ; ILerov et aLll2012f l. Nevertheless, the 
Bayesian res ults are rather dif f erent from the KS parameters 
estimated in IKennicutt et all (|2007T j. in that there is a sig- 
nificant difference between the 13" and 45" data 

E 

Namely, 

the KS index decreases with increasing beam size. A de- 
crease in slope with beam size would be consistent with the 
interpretation offered above for the sub-linear slopes in the 
B08 sample. In the CO bright ISM, if there is a contribution 
from diffuse or dense gas not associated with star formation, 
then increasing the sampling area might also increase the 
contribution from this component to £ mo i, without a cor- 
responding increase in Esfr. This scenario would lead to a 
systematic decrease of the KS index with increasing beam 

sizes. 

Supporting this description, lElmegreenl (|l993T l postu- 



13 In fact, for M51, the maximum slope in the posterior is 0.97, 
ruling out a linear slope under the hierarchical Bayesian frame- 
work. 



14 IKennicutt et al] l|2007h e mploy a bi-linear (FIT EXY) fitting 
routine, and as discussed by ICalzetti et all (l2012h , the inferred 
index can be dependent on the fitting method, as we ll as the size 
of the sampled regions (see also lRahman et al]|20lih . 
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lated the presence of diffuse molecular clouds. In the ISM 
where the pressure is sufficiently high, H2 molecules can 
form in re g ions w hich are not necessarily self-gravitating. 
lElmegreenl l| 19931 ) suggests that due to local and transient 
variations in the ISM, a high pressure region may form 
molecules. Similarly, the diffuse molecular gas may be re- 
turned to the atomic phase before it becomes self-gravitating 
due to a decrease in pressure or an increase in UV radiation 
(e.g. from nearby star formation). Thus, at any given time 
there may be some fraction of the ISM which is in molec- 
ular form but is returned to the atomic phase before star 
formation occurs in that diffuse molecular component. 

Whether the sub-linearity in the KS index is a sign of 
such non-star forming molecular gas remains to be tested. 
One important consideration is whethe r CO can faithfully 
trace th is diffuse molecular component. Idover fc Mac Lowl 
<|2007bh showed that due to effective self-shielding, H2 can 
exist in diffuse regions, whereas CO is easily photodissoci- 
ated. In these regions, CO emission is not a reliable tracer of 
molecular gas. As a result Xco can vary depending on en- 
vironment (iGlover fc Mac LowllioTH ; IShettv et aLll2011al lbh . 
Yet, in the denser regions of a Galaxy, e.g. towards the cen- 
tre, the ISM pressure and density may be sufficiently high, 
allowing CO to form alongside H2 in clouds which are not 
self-gravitating nor star forming, thereby resulting in the 
sub-linear KS relationship we find here. 

Our analysis has focused on the relationship between 
star formation and CO traced gas on approximately kpc- 
scales in external galaxies. Accordingly, the interpretation 
that a sub-linear KS relationship is a sign of molecular gas 
unassociated with star formation can only be applicable on 
those large scales. Detailed analysis on smaller scales should 
also be conducted to verify the prese nce of such gas. Indeed , 
the recent observational analysis by lLongmore et all (|2012T ) 
has indicated that the relationship between the star forma- 
tion rate and gas density in the Central Molecular Zone 
(CMZ) within 500 pc from the Galactic Centre is dis- 
crepant from that inferred in the main disc. Namely, it ap- 
pears that the star formation rate is about an order of mag- 
nitude lower than expected given the high gas densities in 
the CMZ, compared to a linear KS relationship inferred from 
previous investigations. The CMZ environment is rather dif- 
ferent than the main disc of the Milky Way, in that the 
mean density and molecular fractions are sig nificantly higher 
jBallv et al.ll 19871 ; iMorris fc Serabvnlll996r), as well as the 
turbulent velocities (|Ballv et al.l Il987l; iMivazaki fc Tsuboil 
l200d ; lOka et al.l Il998l. l200ll; IShettv et al.l I2012T ). As sug^ 
gested by Longmore et al.l ( 20121 ') . these environmental dif- 
ferences may contribute to the discrepant star formation 
behavior. Such variable ISM environments, including the 
higher fraction of molecular gas towards galactic centers, 
may also be extant in external galaxies, and contribute to 
the sub-linear KS relationship we find here. 

The sub-linear KS relationship we estimate for some in- 
dividual galaxies is only applicable in the ISM where E mo i ^ 
100 Mq pc -2 . As originally discussed by B08, there is evi- 
den ce for a steeping in t he KS ind ex at higher surface densi - 
ties. lOstriker &: Shettvl (|201lh and lShettv fc Ostrikeij (|2012T ) 
postulated a KS relationship with TV ~2 for the molecular 
dominanted ISM of starbursts if supernovae driven turbu- 
lent pressure balances the vertical weight of the disk, leading 
to the self-regulation of star formation. lOstriker fc Shettvl 



(2011) showed that if the Xco factor various continuously 
with surface density, then the best fit KS relationship to an 
observe d sample of starbursts by iGenzel et al.1 l^Old ) has 
N « 2. iNaravanan et al.l (|2012T ) confirmed this prediction, 
using Xcooc Icq 3 , a result based on numerical modeling 
of a suite of galaxy simulations. The apparent break in the 
KS relationship between the starbursts, with super-linear 
slopes, and more quiscent ISM, for which this work has found 
TV < 1 for some individual galaxies, may be indicative of fun- 
damemtal differences in the properties of the ISM between 
the regimes. One possibility is that the relative amount of 
dense and diffuse molecular gas may differ between these 
regimes. 

These ideas will have to be further scrutinized quanti- 
tatively, both theoretically and observationally. Recent the- 
oretical models are now capable of tra cking the formation 
of H2 an d CO in ISM simu lations (e.g. IGlover fc Mac Lowl 
l2007al lbl; IGlover et ahlboiCh . Simulations of colliding flows 
with time-dependent chemistry show that CO forms in 
clouds more r apidly and more p ervasively than the forma- 
tion of stars jClark et al.l 12013 ). Extensions of that work 
are showing that the star formation rate does not increase 
as rapidly with CO abundance in the most massive clouds, 
compared to the trend at lower masses and lower CO den- 
sities (P. Clark et al. In Prep). This scenario is consistent 
with a sub-linear KS index due to the presence of CO in gas 
which is not later converted into stars. 

An extension of the hierarchical Bayesian analysis per- 
formed here, including a larger sample and varying sampling 
beams, may be needed to confirm these trends. Hierarchi- 
cally assessing the KS parameters may quantify the fraction 
of molecular gas not associated with star formation in indi- 
vidual galaxies. Given the variation in KS relationships in 
individual galaxies, we expect to find differences in the dif- 
fuse molecular gas fraction from galaxy to galaxy. Explicitly 
accounting for uncertainties in estimating Esfr and E mo i 
may also reveal the correlations between various conversion 
factors, such as Xco and the conversion from IR luminos- 
ity to Esfr- The hierarchical Bayesian approach is ideally 
suited for these analyses. In future work, we will deploy the 
hierarchical model on larger datasets, which should further 
advance our understanding of star formation in the ISM. 



6.3 Summary of results 

We have introduced a Bayesian linear regression method for 
inferring the parameters of the KS law, which consistently 
treats measurement uncertainties, as well as the hierarchical 
structure of datasets. After demonstrating the accuracy of 
the method on synthetic datasets, we applied the method 
to estimate the KS parameters from observations of disk 
galaxies by B08. 

Our main results are as follows: 

1. A hierarchical Bayesian method is well suited for lin- 
ear regression of structured data with measurement uncer- 
tainties, e.g. numerous measurements of individuals within 
a group. The method simultaneously fits the regression pa- 
rameters of each individual and the group, and provides 
PDFs of the slope, intercept and scatter terms. We demon- 
strate that the posterior, which includes well defined un- 
certainty estimates, accurately recovers the parameters of 
synthetic hierarchical data (Sections [2] and [3]). 
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2. We compared the Bayesian result with the OLS (y\x), 
(x\y), and bisector fits on the synthetic datasets. From the 
test on synthetic datasets, the Bayesian method provides the 
most accurate estimates of underlying parameters for both 
individuals and the population. The commonly employed bi- 
sector usually overestimates the slopes and underestimates 
the intercept. We discuss that the reason for the discrepan- 
cies is that OLS methods provide different summaries of the 
joint distribution. We therefore recommend against the use 
of the bisector for linear regression of data with measure- 
ment uncertainties (Section [3]). 

3. When applied to observed data of spiral galaxies in 
B08 to estimate the KS parameters, we obtained slopes that 
are lower than previous results. In four of the seven galax- 
ies, NGC 5194, NGC 5055, NGC 6946, and NGC 628, the 
slope estimates are well below unity, even at the 2a level. 
For NGC 5194, the posterior median of the slope is 0.72. 
For the group slope and intercept, the posterior median is 
0.84 and -3.00, with 2a range [0.63, 1.0] and [-3.3,-2.7], 
respectively. The posterior median of the intrinsic scatter, 
assuming 25% and 50% uncertainty in E mo i and Esfr, re- 
spectively, is 0.14 dex. Previous bisector results on the en- 
semble overestimated the slopes, as confirmed by the large 
discrepancy in the OLS(EsFFt|E mo i) and OLS(E mo i|Espa) 
estimates. We also noted that a direct Bayesian linear re- 
gression on the ensemble provides limited ranges in the KS 
parameters, thereby attesting to the necessity for treating 
the hierarchical nature of datasets in order to clearly iden- 
tify the differences between individual galaxies (Section [5}. 

4. A sub-linear KS relationship or a decreasing KS in- 
dex with increasing beam size for some individual galaxies 
may be indicative of molecular gas which is not forming 
stars. At low densities or metallicities, CO bright regions 
may be directly associated with star formation. But at in- 
creasingly higher CO luminosities, diffuse or dense molecular 
gas not associated with star formation may be contributing 
to the observed emission. This situation would correspond- 
ingly result in a sub-linear KS relationship, so that the gas 
depletion time is not constant but rather increases with CO 
traced E mo i. As we find significant variation from galaxy- 
to-galaxy, this scenario may be applicable to some galaxies, 
e.g. those with high molecular gas fractions, but not others 
(Section 

To improve our analysis, in future work we will consider 
a larger sample, include a treatment of uncertainties in the 
conversion factors and star formation rate calibrations (ef- 
fects of diffuse emission not related to star formation), and 
explicitly consider the effects of other physical properties 
of the source. As the hierarchical Bayesian framework pre- 
sented here is well-suited for treating any source of uncer- 
tainty, it can naturally handle variations in the conversion 
factors, and may provide additional insights into the prop- 
erties of star formation in the ISM. 
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Table 3. OLS 1 estimated parameters for Test Group A 



Subject True A True N OLS(£sFFt|£moi) A OLS(E SF R|E mo i) N OLS(E mo i|E SFR ) A OLS(E mo i|E S F R ) N Bisector A Bisector N 



Test Galaxy Al 


-2.77 


0.72 


-2.75 


± 


0.11 


0.71 


± 


0.08 


-2.89 


± 


0.29 


0.83 


± 


0.15 


-2.82 


0.77 


Test Galaxy A2 


-3.21 


0.88 


-3.23 


± 


0.13 


0.86 


± 


0.10 


-3.39 


± 


0.27 


0.99 


± 


0.12 


-3.31 


0.92 


Test Galaxy A3 


-3.18 


0.89 


-3.13 


± 


0.14 


0.87 


± 


0.11 


-3.31 


± 


0.26 


1.03 


± 


0.12 


-3.22 


0.95 


Test Galaxy A4 


-2.81 


0.78 


-2.88 


± 


0.13 


0.79 


± 


0.10 


-3.06 


± 


0.27 


0.95 


± 


0.14 


-2.96 


0.87 


Test Galaxy A5 


-2.87 


0.74 


-2.88 


± 


0.13 


0.73 


± 


0.10 


-3.07 


± 


0.32 


0.90 


± 


0.15 


-2.97 


0.81 


Test Galaxy A6 


-3.22 


0.91 


-3.12 


± 


0.13 


0.86 


± 


0.10 


-3.28 


± 


0.24 


1.00 


± 


0.11 


-3.20 


0.93 


Test Galaxy A7 


-2.82 


0.92 


-2.75 


± 


0.14 


0.87 


± 


0.11 


-2.92 


± 


0.22 


1.03 


± 


0.12 


-2.92 


0.95 


Group Parameters 


-2.98 


0.83 


-2.96 


± 


0.06 


0.81 


± 


0.05 


-3.22 


± 


0.11 


1.04 


± 


0.05 


-3.1 


0.92 



1 2a uncertainties are provided for OLS(SgpR |S mo i) and OLS(£ mo i|EsF R ) estimated parameter. 



Table 4. OLS 1 estimated parameters for Test Group B 



Subject True A True N OLS(E S F R |E mo i) A OLS(E S jm|£ mo i) N OLS(E mo i|E S F R ) A OLS(E mo i|E S F R ) N Bisector A Bisector N 



Test Galaxy Bl 


-2.29 


0.84 


-2.26 


± 


0.12 


0.82 


± 


0.09 


-2.40 


± 


0.17 


0.93 


± 


0.12 


-2.33 


0.88 


Test Galaxy B2 


-2.53 


0.92 


-2.55 


± 


0.13 


0.90 


± 


0.10 


-2.70 


± 


0.18 


1.03 


± 


0.11 


-2.62 


0.96 


Test Galaxy B3 


-2.15 


0.96 


-2.46 


± 


0.14 


0.94 


± 


0.11 


-2.64 


± 


0.16 


1.09 


± 


0.11 


-2.55 


1.02 


Test Galaxy B4 


-2.26 


0.92 


-2.32 


± 


0.14 


0.93 


± 


0.10 


-2.47 


± 


0.15 


1.06 


± 


0.10 


-2.39 


0.99 


Test Galaxy B5 


-2.33 


1.00 


-2.33 


± 


0.13 


0.99 


± 


0.10 


-2.48 


± 


0.13 


1.12 


± 


0.09 


-2.40 


1.05 


Test Galaxy B6 


-2.54 


1.12 


-2.44 


± 


0.13 


1.07 


± 


0.10 


-2.57 


± 


0.11 


1.18 


± 


0.08 


-2.51 


1.12 


Test Galaxy B7 


-2.12 


0.95 


-2.03 


± 


0.14 


0.90 


± 


0.10 


-2.21 


± 


0.13 


1.05 


± 


0.11 


-2.12 


0.98 


Group Parameters 


2.37 


0.96 


2.34 


± 


0.06 


0.93 


± 


0.05 


-2.57 


± 


0.06 


1.13 


± 


0.04 


2.45 


1.03 



1 2<r uncertainties are provided for OLS(Ssfr |S mo i) and OLS(£ mo i|£sFR.) estimated parameters. 



Table 5. Adopted and Bayesian inferred parameters for Test Group C 



Subject 


Datapoints 


True A 


True N 


True Scatter 1 


Bayes A 


Bayes 2a a 


Bayes N 


Bayes 2crjv 


Bayes <7 scat 


Test Galaxy CI 


5 


-3.00 


1.50 


0.3 


-2.63 


[-3.00, 


-2.33] 


1.22 


[1.00, 1.49] 


0.17 


Test Galaxy C2 


7 


-2.95 


1.46 


0.3 


-2.58 


[-2.90, 


-2.30] 


1.22 


[1.01, 1.47] 


0.17 


Test Galaxy C3 


9 


—2.89 


1.42 


0.3 


-2.60 


[-2.93, 


-2.34] 


1.21 


[1.00, 1.46] 


0.17 


Test Galaxy C4 


11 


-2.84 


1.37 


0.3 


-2.63 


[-2.96, 


-2.37] 


1.19 


[0.99, 1.43] 


0.17 


Test Galaxy C5 


13 


-2.79 


1.33 


0.3 


-2.58 


[-2.88, 


-2.33] 


1.20 


[1.00, 1.43] 


0.17 


Test Galaxy C6 


15 


-2.74 


1.29 


0.3 


-2.54 


[-2.82, 


-2.30] 


1.12 


[0.93, 1.32] 


0.17 


Test Galaxy C7 


17 


-2.68 


1.25 


0.3 


-2.55 


[-2.81, 


-2.31] 


1.17 


[0.98, 1.37] 


0.17 


Test Galaxy C8 


19 


-2.63 


1.21 


0.3 


-2.58 


[-2.83, 


-2.34] 


1.15 


[0.97, 1.34] 


0.17 


Test Galaxy C9 


21 


-2.58 


1.16 


0.3 


-2.59 


[-2.77, 


-2.30] 


1.14 


[0.97, 1.33] 


0.17 


Test Galaxy CIO 


23 


-2.53 


1.12 


0.3 


-2.48 


[-2.72, 


-2.26] 


1.07 


[0.90, 1.25] 


0.17 


Test Galaxy Cll 


25 


-2.47 


1.08 


0.3 


-2.45 


[-2.69, 


-2.24] 


1.10 


[0.93, 1.29] 


0.17 


Test Galaxy C12 


27 


-2.42 


1.04 


0.3 


-2.43 


[-2.66, 


-2.22] 


1.07 


[0.91, 1.24] 


0.17 


Test Galaxy C13 


29 


-2.37 


0.99 


0.3 


-2.41 


[-2.63, 


-2.20] 


1.04 


[0.88, 1.21] 


0.17 


Test Galaxy C14 


31 


-2.32 


0.95 


0.3 


-2.32 


[-2.54, 


-2.13] 


0.99 


[0.83, 1.15] 


0.17 


Test Galaxy C15 


33 


-2.26 


0.91 


0.3 


-2.31 


[-2.51, 


-2.11] 


0.93 


[0.78, 1.09] 


0.17 


Test Galaxy C16 


35 


-2.21 


0.87 


0.3 


-2.31 


[-2.54, 


-2.11] 


0.94 


[0.78, 1.12] 


0.17 


Test Galaxy C17 


37 


-2.16 


0.86 


0.3 


-2.21 


[-2.42, 


-2.02] 


0.89 


[0.74, 1.04] 


0.17 


Test Galaxy C18 


39 


-2.11 


0.78 


0.3 


-2.20 


[-2.40, 


-2.02] 


0.86 


[0.72, 1.01] 


0.17 


Test Galaxy C19 


41 


-2.05 


0.74 


0.3 


-2.18 


[-2.37, 


-2.00] 


0.83 


[0.68, 0.98] 


0.17 


Test Galaxy C20 


43 


-2.00 


0.70 


0.3 


-2.14 


[-2.33, 


-1.96] 


0.79 


[0.65, 0.95] 


0.17 


Group C Parameters 


480 


2.5 


1.1 


0.3 


2.43 


[-2.57, 


2.31] 


1.06 


[0.95, 1.17] 


0.17 



For this dataset, the intrinsic scatter is uniformly distributed. Values refer to the width of the distribution, centered on 0. 



Table 6. OLS 1 estimated parameters for Test Group C 



Subject 


Datapoints True A 


TUm AT 


OLS(E S pa|E mol ) A 


OLS(E SF r| 


V \ AT 


OLS(E mol |E SP R) A 


OLS(E mol |! 


*\ at 


Bisector A 


Bisector N 


Test Galaxy CI 


u 


— 3 oo 

o.uu 


1.50 


—3 i n 

O. -LU 




n i 3 

u. io 


1 53 




0.10 


— 3 1 




07 


1 54 




0.04 


-3.10 


1.53 


Test Galaxy C2 


7 


— 9 Q5 
zj. yu 


1.46 


— 9 83 


-|- 


n 90 

u.^u 


1 3Q 




0.15 


— 9 85 




0.12 


1.41 




0.08 


-2.85 


1.40 


Test Galaxy C3 


Q 
■J 


9 8Q 


1.42 


9 88 


4- 


n i fi 


1 41 


4- 


0.12 


9 QO 


4. 


10 


1 43 


4. 


0.12 


-2.89 


1.42 


Test Galaxy C4 


1 1 


— 2.84 


1.37 


— 2.84 


-j- 


n 1 5 


1 33 




0.11 


— 9 87 




OQ 


1 35 




0.05 


-2.86 


1.34 


Test Galaxy C5 


13 


—2.79 


1.33 


—2.76 


-J- 


0.12 


1.34 


-J- 


0.09 


—2.78 


-J- 


0.08 


1.35 


-J- 


0.05 


-2.77 


1.34 


Test Galaxy C6 


1 r > 

-LU 


— 9 74 


1.29 


—9 fin 




n i 3 

U. IO 


1 1 5 


-j- 


0.10 


—2.62 


-j- 


1 1 

U. 1 -L 


1 1 8 


-j- 


0.07 


-2.61 


1.16 


Test Galaxy C7 


1 7 


9 fi8 


1.25 


9 fi3 


4- 


17 


1 99 


4. 


0.13 


9 fiQ 


4. 


19 


1 97 


4- 


0.08 


-2.66 


1.25 


Test Galaxy C8 




— 9 fi3 


1.21 


— 9 fiQ 




1 1 


1.23 


-|- 


0.09 


— 2.72 




08 
u.uo 


1.25 


-|- 


0.06 


-2.71 


1.24 


Test Galaxy C9 


21 


-2.58 


1.16 


-2.56 


± 


0.10 


1.17 


± 


0.07 


-2.59 


± 


0.07 


1.19 


± 


0.05 


-2.57 


1.18 


Test Galaxy CIO 


23 


-2.53 


1.12 


-2.48 


± 


0.09 


1.06 


± 


0.07 


-2.51 


± 


0.09 


1.09 


± 


0.06 


-2.50 


1.08 


Test Galaxy Cll 


25 


-2.47 


1.08 


-2.44 


± 


0.11 


1.09 


± 


0.09 


-2.49 


± 


0.10 


1.13 


± 


0.07 


-2.47 


1.11 


Tp«t HfllaYv PI 9 

leal VjrdltxA.y v>_L^ 


27 


-2.42 


1.04 


-2.41 


± 


0.10 


1.05 


± 


OS 
u.uo 


-2.45 


± 


0.09 


1.08 


± 


07 
u.u / 


— 2.43 


1 07 

-L .U 1 


Test Galaxy C13 


29 


-2.37 


0.99 


-2.37 


± 


0.11 


1.01 


± 


0.08 


-2.42 


± 


0.11 


1.05 


± 


0.08 


-2.40 


1.03 


Test Galaxy C14 


31 


-2.32 


0.95 


-2.25 


± 


0.10 


0.93 


± 


0.08 


-2.31 


± 


0.11 


0.98 


± 


0.08 


-2.28 


0.95 


Test Galaxy C15 


33 


-2.26 


0.91 


-2.22 


± 


0.08 


0.86 


± 


0.06 


-2.26 


± 


0.10 


0.89 


± 


0.08 


-2.24 


0.88 


Test Galaxy C16 


35 


-2.21 


0.87 


-2.23 


± 


0.08 


0.88 


± 


0.07 


-2.28 


± 


0.11 


0.92 


± 


0.08 


-2.26 


0.91 


Test Galaxy C17 


37 


-2.16 


0.86 


-2.10 


± 


0.09 


0.80 


± 


0.07 


-2.16 


± 


0.12 


0.85 


± 


0.10 


-2.13 




Test Galaxy C18 


39 


-2.11 


0.78 


-2.09 


± 


0.07 


0.77 


± 


0.05 


-2.12 


± 


0.11 


0.80 


± 


0.08 


-2.10 




Test Galaxy C19 


41 


-2.05 


0.74 


-2.06 


± 


0.07 


0.74 


± 


0.05 


-2.11 


± 


0.12 


0.78 


± 


0.10 


-2.08 




Test Galaxy C20 


43 


-2.00 


0.70 


-2.01 


± 


0.07 


0.69 


± 


0.05 


-2.06 


± 


0.13 


0.73 


± 


0.10 


-2.04 




Group C Parameters 


480 


2.5 


1.1 


-2.34 


± 


0.03 


0.98 


± 


0.03 


-2.44 


± 


0.03 


1.06 


± 


0.03 


2.39 


1.® 



3 



O 

c 

1 la uncertainties are provided for OLS(£sFR|S mo i) and OLS(S mo i|EgFR) estimated parameters. j^j 



